Prepared by Jayoda Kulatunga, Undergraduate Student, Pearson HND in Computing - Software Engineering, ESOFT Metro Campus
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
Matplotlib was created by John D. Hunter.
Matplotlib is open source and we can use it freely.
Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and JavaScript for platform compatibility.
The source code for Matplotlib is located in this GitHub repository: https://github.com/matplotlib/matplotlib
If you have Python and pip already installed on a system, then installation of Matplotlib is very easy.
Install it using this command:
C:\Users\Your Name>pip install matplotlib
If this command fails, use a Python distribution that already has Matplotlib installed, such as Anaconda or Spyder.
Once Matplotlib is installed, import it in your applications by adding the import module statement:
import matplotlib
Matplotlib is now imported and ready to use.
The version string is stored in the __version__ attribute.
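As a quick check that the installation worked, the version can be printed. This is only a minimal sketch; the exact string depends on the release installed on your system.

```python
import matplotlib

# __version__ is a plain string describing the installed release
print(matplotlib.__version__)
```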
Most of the Matplotlib utilities lie under the pyplot submodule, and are usually imported under the plt alias:
import matplotlib.pyplot as plt
Now the Pyplot package can be referred to as plt.
The plot() function is used to draw points (markers) in a diagram.
By default, the plot() function draws a line from point to point.
The function takes parameters for specifying points in the diagram.
Parameter 1 is an array containing the points on the x-axis.
Parameter 2 is an array containing the points on the y-axis.
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays [1, 8] and [3, 10] to the plot function.
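The line described above can be sketched as follows. The variable names xpoints and ypoints are chosen here for illustration only.

```python
import matplotlib.pyplot as plt

xpoints = [1, 8]   # x-coordinates of the two points
ypoints = [3, 10]  # y-coordinates of the two points

# plot() returns a list with one Line2D object per drawn line
lines = plt.plot(xpoints, ypoints)
plt.show()
```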
The x-axis is the horizontal axis.
The y-axis is the vertical axis.
To plot only the markers, you can use shortcut string notation parameter 'o', which means 'rings'.
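A minimal sketch of the marker-only notation, reusing the two points from the line example:

```python
import matplotlib.pyplot as plt

# 'o' as the third argument draws a ring at each point and no connecting line
(line,) = plt.plot([1, 8], [3, 10], 'o')
plt.show()
```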
You can plot as many points as you like; just make sure you have the same number of points on both axes.
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 etc., depending on the length of the y-points.
So, if we take the same example as above, and leave out the x-points, the diagram will look like this:
The x-points in the example above are [0, 1, 2, 3, 4, 5].
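The default x-values can be checked with a short sketch. The six y-values below are illustrative sample numbers, not data taken from the original example.

```python
import matplotlib.pyplot as plt

ypoints = [3, 8, 1, 10, 5, 7]  # illustrative values; no x-points are given

(line,) = plt.plot(ypoints)

# Matplotlib fills in x = 0, 1, 2, ... up to len(ypoints) - 1
default_x = list(line.get_xdata())
plt.show()
```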
You can use the keyword argument marker to emphasize each point with a specified marker:
You can choose any of these markers:
Marker | Description
'o' | Circle
'*' | Star
'.' | Point
',' | Pixel
'x' | X
'X' | X (filled)
'+' | Plus
'P' | Plus (filled)
's' | Square
'D' | Diamond
'd' | Diamond (thin)
'p' | Pentagon
'H' | Hexagon
'h' | Hexagon
'v' | Triangle Down
'^' | Triangle Up
'<' | Triangle Left
'>' | Triangle Right
'1' | Tri Down
'2' | Tri Up
'3' | Tri Left
'4' | Tri Right
'|' | Vline
'_' | Hline
You can also use the shortcut string notation parameter to specify the marker.
This parameter is also called fmt. Its three parts are written back to back, with no separator, in this order:
[marker][line][color]
The marker value can be anything from the Marker Reference above.
The line value can be one of the following:
Line Syntax | Description
'-' | Solid line
':' | Dotted line
'--' | Dashed line
'-.' | Dashed/dotted line
Note: If you leave out the line value in the fmt parameter, no line will be plotted.
The short color value can be one of the following:
Color Syntax | Description
'r' | Red
'g' | Green
'b' | Blue
'c' | Cyan
'm' | Magenta
'y' | Yellow
'k' | Black
'w' | White
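Putting the three tables together, a format string such as 'o:r' asks for ring markers, a dotted line, and the color red. The sketch below uses illustrative y-values.

```python
import matplotlib.pyplot as plt

ypoints = [3, 8, 1, 10]  # illustrative values

# 'o' = ring marker, ':' = dotted line, 'r' = red
(line,) = plt.plot(ypoints, 'o:r')
plt.show()
```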
You can use the keyword argument markersize, or the shorter version ms, to set the size of the markers:
You can use the keyword argument markeredgecolor or the shorter mec to set the color of the edge of the markers:
You can use the keyword argument markerfacecolor or the shorter mfc to set the color inside the edge of the markers:
Use both the mec and mfc arguments to color the entire marker:
You can also use hexadecimal color values, or any of the 140 supported color names.
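The marker arguments can be combined in one call. This is a sketch with illustrative values; the hexadecimal color '#4CAF50' is just an example shade of green.

```python
import matplotlib.pyplot as plt

ypoints = [3, 8, 1, 10]  # illustrative values

(line,) = plt.plot(
    ypoints,
    marker='o',
    ms=20,          # marker size
    mec='r',        # marker edge color (short color name)
    mfc='#4CAF50',  # marker face color (hexadecimal value)
)
plt.show()
```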
You can use the keyword argument linestyle, or shorter ls, to change the style of the plotted line:
The line style can be written in a shorter syntax:
linestyle can be written as ls.
'dotted' can be written as ':'.
'dashed' can be written as '--'.
You can choose any of these styles:
Style | Or
'solid' (default) | '-'
'dotted' | ':'
'dashed' | '--'
'dashdot' | '-.'
'None' | '' or ' '
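Both spellings from the table can be tried side by side. In this sketch the second line is shifted upward only so that the two lines do not overlap; the values are illustrative.

```python
import matplotlib.pyplot as plt

ypoints = [3, 8, 1, 10]  # illustrative values

# Long form: keyword linestyle with the style name
(dotted,) = plt.plot(ypoints, linestyle='dotted')

# Short form: keyword ls with the symbol
(dashed,) = plt.plot([v + 2 for v in ypoints], ls='--')
plt.show()
```

Matplotlib stores the style in its short form, so both lines report a symbol when queried with get_linestyle().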
You can use the keyword argument color or the shorter c to set the color of the line:
You can also use hexadecimal color values, or any of the 140 supported color names.
You can use the keyword argument linewidth or the shorter lw to change the width of the line.
The value is a floating-point number, in points:
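A short sketch of the color and width arguments, using the short aliases c and lw with illustrative values:

```python
import matplotlib.pyplot as plt

ypoints = [3, 8, 1, 10]  # illustrative values

# c is the short form of color, lw the short form of linewidth
(line,) = plt.plot(ypoints, c='hotpink', lw=20.5)
plt.show()
```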
You can plot as many lines as you like by simply adding more plt.plot() calls:
You can also plot many lines by adding the points for the x- and y-axis for each line in the same plt.plot() call.
(In the examples above we only specified the points on the y-axis, meaning that the points on the x-axis got the default values 0, 1, 2, 3.)
The x- and y-values come in pairs:
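The paired form can be sketched like this, with two illustrative lines passed as x1, y1, x2, y2 in a single call:

```python
import matplotlib.pyplot as plt

x1, y1 = [0, 1, 2, 3], [3, 8, 1, 10]   # first line (illustrative values)
x2, y2 = [0, 1, 2, 3], [6, 2, 7, 11]   # second line (illustrative values)

# Each x/y pair produces one line; plot() returns one Line2D per pair
lines = plt.plot(x1, y1, x2, y2)
plt.show()
```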
With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.
With Pyplot, you can use the title() function to set a title for the plot.
You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font properties for the title and labels.
You can use the loc parameter in title() to position the title.
Legal values are: 'left', 'right', and 'center'. Default value is 'center'.
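The labelling functions above might be combined as in the sketch below. The data, the label texts, and the two font dictionaries are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [80, 90, 100, 110, 120]      # illustrative values
y = [240, 260, 290, 310, 330]    # illustrative values

# Font properties are passed as ordinary dictionaries
title_font = {'family': 'serif', 'color': 'blue', 'size': 20}
label_font = {'family': 'serif', 'color': 'darkred', 'size': 15}

plt.plot(x, y)
title = plt.title("Sports Watch Data", fontdict=title_font, loc='left')
xlabel = plt.xlabel("Average Pulse", fontdict=label_font)
ylabel = plt.ylabel("Calorie Burnage", fontdict=label_font)
plt.show()
```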
With Pyplot, you can use the grid() function to add grid lines to the plot.
You can use the axis parameter in the grid() function to specify which grid lines to display.
Legal values are: 'x', 'y', and 'both'. Default value is 'both'.
You can also set the line properties of the grid, like this: grid(color = 'color', linestyle = 'linestyle', linewidth = number).
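As a sketch of the grid options, the call below shows only the grid lines that belong to the x-axis and styles them. The data and the chosen color, style, and width are illustrative.

```python
import matplotlib.pyplot as plt

x = [80, 90, 100, 110, 120]      # illustrative values
y = [240, 260, 290, 310, 330]    # illustrative values

plt.plot(x, y)

# Only x-axis grid lines, drawn green, dashed and thin
plt.grid(axis='x', color='green', linestyle='--', linewidth=0.5)

ax = plt.gca()  # the axes the grid was added to
plt.show()
```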
With the subplot() function you can draw multiple plots in one figure:
Draw 2 plots:
The subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by the first and second argument.
The third argument represents the index of the current plot.
So, if we want a figure with 2 rows and 1 column (meaning that the two plots will be displayed on top of each other instead of side-by-side), we can write the syntax like this:
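A minimal sketch of the 2-row, 1-column layout, using illustrative data for the two plots:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y_top = [3, 8, 1, 10]        # illustrative values
y_bottom = [10, 20, 30, 40]  # illustrative values

# 2 rows, 1 column, first plot (top)
ax_top = plt.subplot(2, 1, 1)
plt.plot(x, y_top)

# 2 rows, 1 column, second plot (bottom)
ax_bottom = plt.subplot(2, 1, 2)
plt.plot(x, y_bottom)

plt.show()
```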
You can draw as many plots as you like on one figure; just describe the number of rows, columns, and the index of the plot.
Draw 6 plots:
You can add a title to each plot with the title() function:
2 plots, with titles:
You can add a title to the entire figure with the suptitle() function:
Add a title for the entire figure:
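The per-plot titles and the figure title can be sketched together. The data and the title texts are illustrative.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]

# 1 row, 2 columns, first plot
plt.subplot(1, 2, 1)
plt.plot(x, [3, 8, 1, 10])
left_title = plt.title("SALES")

# 1 row, 2 columns, second plot
plt.subplot(1, 2, 2)
plt.plot(x, [10, 20, 30, 40])
right_title = plt.title("INCOME")

# suptitle() places one title above the whole figure
figure_title = plt.suptitle("MY SHOP")
plt.show()
```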
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same length, one for the values of the x-axis, and one for values on the y-axis:
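A sketch of such a call is shown here. The 13 age and speed values are illustrative sample numbers, not measurements taken from the source.

```python
import matplotlib.pyplot as plt

# Illustrative data: age of each car and its speed when passing
age = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
speed = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# One dot per (age, speed) observation
points = plt.scatter(age, speed)
plt.show()
```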
The observations in the example above are the result of 13 cars passing by.
The X-axis shows how old the car is.
The Y-axis shows the speed of the car when it passes.
Are there any relationships between the observations?
It seems that the newer the car, the faster it drives, but that could be a coincidence. After all, we only registered 13 cars.
In the example above, there seems to be a relationship between speed and age, but what if we plot the observations from another day as well? Will the scatter plot tell us something else?
Note: The two plots are drawn in two different colors, by default blue and orange. You will learn how to change colors later in this chapter.
By comparing the two plots, I think it is safe to say that they both give us the same conclusion: the newer the car, the faster it drives.
You can set your own color for each scatter plot with the color or the c argument:
You can even set a specific color for each dot by using an array of colors as value for the c argument:
Note: Pass the array through the c argument. Unlike color, c also accepts numeric values that are mapped to colors through a colormap.
Set your own color of the markers:
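A sketch of per-dot colors, assuming the same illustrative 13 observations and an arbitrary list of 13 color names:

```python
import matplotlib.pyplot as plt

age = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]                 # illustrative
speed = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]   # illustrative

# One color per dot, in the same order as the observations
colors = ["red", "green", "blue", "yellow", "pink", "black", "orange",
          "purple", "beige", "brown", "gray", "cyan", "magenta"]

points = plt.scatter(age, speed, c=colors)
plt.show()
```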
The Matplotlib module has a number of available colormaps.
A colormap is like a list of colors, where each color has a value that ranges from 0 to 100.
Here is an example of a colormap:
This colormap is called 'viridis' and as you can see it ranges from 0, which is a purple color, up to 100, which is a yellow color.
You can specify the colormap with the keyword argument cmap with the value of the colormap, in this case 'viridis' which is one of the built-in colormaps available in Matplotlib.
In addition you have to create an array with values (from 0 to 100), one value for each point in the scatter plot:
You can include the colormap in the drawing by including the plt.colorbar() statement:
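The colormap steps above can be sketched as follows. The observations and the 13 color values between 0 and 100 are illustrative.

```python
import matplotlib.pyplot as plt

age = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]                 # illustrative
speed = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]   # illustrative

# One numeric value per dot; the colormap turns each value into a color
values = [0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100]

points = plt.scatter(age, speed, c=values, cmap='viridis')

# Draw the color scale next to the plot
bar = plt.colorbar()
plt.show()
```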
Name | Reverse
Accent | Accent_r
Blues | Blues_r
BrBG | BrBG_r
BuGn | BuGn_r
BuPu | BuPu_r
CMRmap | CMRmap_r
Dark2 | Dark2_r
GnBu | GnBu_r
Greens | Greens_r
Greys | Greys_r
OrRd | OrRd_r
Oranges | Oranges_r
PRGn | PRGn_r
Paired | Paired_r
Pastel1 | Pastel1_r
Pastel2 | Pastel2_r
PiYG | PiYG_r
PuBu | PuBu_r
PuBuGn | PuBuGn_r
PuOr | PuOr_r
PuRd | PuRd_r
Purples | Purples_r
RdBu | RdBu_r
RdGy | RdGy_r
RdPu | RdPu_r
RdYlBu | RdYlBu_r
RdYlGn | RdYlGn_r
Reds | Reds_r
Set1 | Set1_r
Set2 | Set2_r
Set3 | Set3_r
Spectral | Spectral_r
Wistia | Wistia_r
YlGn | YlGn_r
YlGnBu | YlGnBu_r
YlOrBr | YlOrBr_r
YlOrRd | YlOrRd_r
afmhot | afmhot_r
autumn | autumn_r
binary | binary_r
bone | bone_r
brg | brg_r
bwr | bwr_r
cividis | cividis_r
cool | cool_r
coolwarm | coolwarm_r
copper | copper_r
cubehelix | cubehelix_r
flag | flag_r
gist_earth | gist_earth_r
gist_gray | gist_gray_r
gist_heat | gist_heat_r
gist_ncar | gist_ncar_r
gist_rainbow | gist_rainbow_r
gist_stern | gist_stern_r
gist_yarg | gist_yarg_r
gnuplot | gnuplot_r
gnuplot2 | gnuplot2_r
gray | gray_r
hot | hot_r
hsv | hsv_r
inferno | inferno_r
jet | jet_r
magma | magma_r
nipy_spectral | nipy_spectral_r
ocean | ocean_r
pink | pink_r
plasma | plasma_r
prism | prism_r
rainbow | rainbow_r
seismic | seismic_r
spring | spring_r
summer | summer_r
tab10 | tab10_r
tab20 | tab20_r
tab20b | tab20b_r
tab20c | tab20c_r
terrain | terrain_r
twilight | twilight_r
twilight_shifted | twilight_shifted_r
viridis | viridis_r
winter | winter_r
You can change the size of the dots with the s argument.
Just like colors, make sure the array for sizes has the same length as the arrays for the x- and y-axis:
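A sketch of per-dot sizes, again with illustrative observations and an arbitrary list of 13 sizes:

```python
import matplotlib.pyplot as plt

age = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]                 # illustrative
speed = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]   # illustrative

# One size per dot, same length as the x and y arrays
sizes = [20, 50, 100, 200, 500, 1000, 60, 90, 10, 300, 600, 800, 75]

points = plt.scatter(age, speed, s=sizes)
plt.show()
```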
You can adjust the transparency of the dots with the alpha argument.
The value ranges from 0 (fully transparent) to 1 (fully opaque):
You can combine a colormap with different sizes of the dots. This is best visualized if the dots are transparent:
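One way to sketch this combination is with random data. The seed, the number of points, the value ranges, and the choice of the 'nipy_spectral' colormap are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

x = rng.integers(100, size=100)
y = rng.integers(100, size=100)
colors = rng.integers(100, size=100)      # one colormap value per dot
sizes = 10 * rng.integers(100, size=100)  # one size per dot

# alpha=0.5 makes overlapping dots visible through each other
points = plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='nipy_spectral')
plt.colorbar()
plt.show()
```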
With Pyplot, you can use the bar() function to draw bar graphs:
The bar() function takes arguments that describe the layout of the bars.
The categories and their values are represented by the first and second arguments, as arrays.
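A minimal sketch of a bar graph with four made-up categories:

```python
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]  # illustrative category names
values = [3, 8, 1, 10]             # illustrative values

# First argument: categories, second argument: their values
bars = plt.bar(categories, values)
plt.show()
```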
If you want the bars to be displayed horizontally instead of vertically, use the barh() function:
The bar() and barh() functions take the keyword argument color to set the color of the bars:
You can use any of the 140 supported color names, or hexadecimal color values.
The bar() function takes the keyword argument width to set the width of the bars. The default width is 0.8.
For horizontal bars, use height instead of width: the barh() function takes the keyword argument height to set the height of the bars. The default height is 0.8.
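The horizontal variant with a custom color and bar height might look like this sketch. The categories, values, color, and height are illustrative.

```python
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]  # illustrative category names
values = [3, 8, 1, 10]             # illustrative values

# barh() draws horizontal bars; height replaces width for bar thickness
bars = plt.barh(categories, values, color="#4CAF50", height=0.1)
plt.show()
```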
A histogram is a graph showing frequency distributions.
It is a graph showing the number of observations within each given interval.
Example: Say you ask for the height of 250 people. You might end up with a histogram like this:
You can read from the histogram that there are approximately:
2 people from 140 to 145cm
5 people from 145 to 150cm
15 people from 151 to 156cm
31 people from 157 to 162cm
46 people from 163 to 168cm
53 people from 168 to 173cm
45 people from 173 to 178cm
28 people from 179 to 184cm
21 people from 185 to 190cm
4 people from 190 to 195cm
In Matplotlib, we use the hist() function to create histograms.
The hist() function uses an array of numbers to create a histogram; the array is passed to the function as an argument.
For simplicity we use NumPy to randomly generate an array with 250 values, where the values will concentrate around 170, and the standard deviation is 10.
This will generate a random result, and could look like this:
The hist() function will read the array and produce a histogram:
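The two steps can be sketched together. The fixed seed is an assumption added here so that repeated runs give the same array; without it the result differs on every run.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily for repeatability

# 250 values centred on 170 with a standard deviation of 10
heights = rng.normal(170, 10, 250)

# hist() returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, patches = plt.hist(heights)
plt.show()
```

With the default settings the values are sorted into 10 bins, so there are 11 bin edges.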
With Pyplot, you can use the pie() function to draw pie charts:
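A minimal sketch of a pie chart with four values. The fractions list is computed here only to illustrate how each wedge's share of the circle follows from value / sum of values.

```python
import matplotlib.pyplot as plt

y = [35, 25, 25, 15]

# pie() returns several things; the first is the list of wedges
wedges = plt.pie(y)[0]

# Share of the circle taken by each wedge
fractions = [value / sum(y) for value in y]
plt.show()
```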
As you can see the pie chart draws one piece (called a wedge) for each value in the array (in this case [35, 25, 25, 15]).
By default the plotting of the first wedge starts from the x-axis and moves counterclockwise:
Note: The size of each wedge is determined by comparing the value with all the other values, by using this formula:
The value divided by the sum of all values: x/sum(x)
Add labels to the pie chart with the labels parameter.
The labels parameter must be an array with one label for each wedge:
As mentioned the default start angle is at the x-axis, but you can change the start angle by specifying a startangle parameter.
The startangle parameter is defined with an angle in degrees, default angle is 0:
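Labels and a start angle can be combined in one call, as in this sketch. The fruit names are illustrative labels.

```python
import matplotlib.pyplot as plt

y = [35, 25, 25, 15]
labels = ["Apples", "Bananas", "Cherries", "Dates"]  # illustrative labels

# startangle=90 makes the first wedge start at the top instead of the x-axis
wedges = plt.pie(y, labels=labels, startangle=90)[0]
plt.show()
```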
Maybe you want one of the wedges to stand out? The explode parameter allows you to do that.
The explode parameter, if specified, and not None, must be an array with one value for each wedge.
Each value represents how far from the center each wedge is displayed:
Add a shadow to the pie chart by setting the shadow parameter to True:
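The sketch below pulls the first wedge 0.2 away from the center and adds a shadow. The labels and the explode distances are illustrative.

```python
import matplotlib.pyplot as plt

y = [35, 25, 25, 15]
labels = ["Apples", "Bananas", "Cherries", "Dates"]  # illustrative labels

# One distance per wedge; 0 leaves a wedge in place
explode = [0.2, 0, 0, 0]

wedges = plt.pie(y, labels=labels, explode=explode, shadow=True)[0]
plt.show()
```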
You can set the color of each wedge with the colors parameter.
The colors parameter, if specified, must be an array with one value for each wedge:
You can use hexadecimal color values, any of the 140 supported color names, or one of these shortcuts:
'r' - Red
'g' - Green
'b' - Blue
'c' - Cyan
'm' - Magenta
'y' - Yellow
'k' - Black
'w' - White
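The three kinds of color value can be mixed in one list, as in this sketch. The particular colors are arbitrary examples.

```python
import matplotlib.pyplot as plt

y = [35, 25, 25, 15]

# A color name, another color name, a shortcut, and a hexadecimal value
colors = ["black", "hotpink", "b", "#4CAF50"]

wedges = plt.pie(y, colors=colors)[0]
plt.show()
```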
To add a list of explanations for each wedge, use the legend() function:
To add a header to the legend, add the title parameter to the legend function.
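A sketch of a legend with a header. The labels and the header text are illustrative.

```python
import matplotlib.pyplot as plt

y = [35, 25, 25, 15]
labels = ["Apples", "Bananas", "Cherries", "Dates"]  # illustrative labels

plt.pie(y, labels=labels)

# legend() picks up the wedge labels; title adds a header above them
legend = plt.legend(title="Four Fruits:")
plt.show()
```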
w3schools.com (n.d.). Matplotlib Tutorial. [online] www.w3schools.com. Available at: https://www.w3schools.com/python/matplotlib_intro.asp.